Beyond LINQ: A Manifesto For Distributed Data-Intensive Programming

The LINQ project as embodied by C# 3.0 and Visual Basic 9 brings concepts from functional programming such as type-inference, lambda-expressions, and most importantly monad comprehensions into mainstream object-oriented programming. This is definitely exciting for the programming language community, but realistically, it is just a tiny step towards democratizing the building of distributed data-intensive applications. To merely approach that goal there is still much work to do in (at least) the following areas:

Tools and IDE
It is fair to say that the days of writing code using a text editor and batch compiler are over. Visual Studio, Eclipse, and Emacs are the norm rather than the exception. However, whenever you meet an (ex)-Smalltalk or VB6 programmer, they reminisce about the highly interactive development environment, scripters cannot live without their REPL, and tools like Ruby on Rails disrupt traditional development because of their simplicity and quick turn-around time. To simplify programming for the masses we need to shake off the yoke of the dreaded (edit, compile, run, debug)* loop and replace it with a lightweight (edit=compile=run=debug) experience.
Language and Type Systems
Writing distributed data-intensive applications naturally means dealing with many forms of data: relational, XML, objects, typed, semi-structured, or untyped. Current languages are not well equipped to deal with any scrap of data, and much language and type-system innovation is required, such as explicit relationships, contracts, layered type-systems, seamlessly dealing with both static and dynamic typing in the same program, extensibility, etc. The challenge is to package advanced ideas from programming language research in such a way that you do not need a PhD in type-theory to understand them.
Runtime and Libraries
Sometimes, and preferably, the compiler is able to map all language and type extensions to an existing runtime such as the JVM or the CLR. However, often this is not feasible and we need to extend the runtime infrastructure to accommodate these new features. A prime example is the support for generics in the CLR versus the JVM; other examples include efficient support for first class continuations, query execution, etc. Obviously, a dynamic language is ultimately all about how dynamic the underlying runtime infrastructure is, unless you emulate the dynamic features of the language, which kills interoperability. In addition to runtime support, new language and type-system features often need extensive library and infrastructure support. For example, in the context of LINQ, the language extensions are just the tip of the iceberg, and in fact the bulk of the work is in the libraries such as XLinq and in particular the OR-mapping infrastructure.
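
To illustrate the "tip of the iceberg" remark, here is a minimal sketch in Python (not C#) of how a query comprehension is only sugar over a few library functions. The helpers select_many, where, and select are hypothetical stand-ins modeled on LINQ's SelectMany, Where, and Select, and the customer and order data are invented for the example.

```python
from typing import Callable, Iterable, Iterator, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def select_many(xs: Iterable[A], f: Callable[[A], Iterable[B]]) -> Iterator[B]:
    """Monadic bind for sequences: apply f to each element and flatten."""
    for x in xs:
        yield from f(x)


def where(xs: Iterable[A], pred: Callable[[A], bool]) -> Iterator[A]:
    """Keep only the elements that satisfy pred."""
    return (x for x in xs if pred(x))


def select(xs: Iterable[A], f: Callable[[A], B]) -> Iterator[B]:
    """Map f over the sequence."""
    return (f(x) for x in xs)


# Invented sample data: (name, city) and (customer name, order total).
customers = [("alice", "Seattle"), ("bob", "Utrecht")]
orders = [("alice", 30), ("bob", 5), ("alice", 12)]

# Comprehension syntax: the part the language provides.
sugar = [
    (name, total)
    for (name, _city) in customers
    for (who, total) in orders
    if who == name and total > 10
]

# The same query written directly against the library functions.
desugared = list(
    select_many(
        customers,
        lambda c: select(
            where(orders, lambda o: o[0] == c[0] and o[1] > 10),
            lambda o: (c[0], o[1]),
        ),
    )
)

print(sugar == desugared)
```

The syntax is a thin layer. Anything that supplies these three operations for another data source (XML trees, database tables) can sit behind the same comprehension, which is where most of the library work goes.
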
Transactions Everywhere
To make programming web services accessible to the masses, we need to have a comprehensible way to deal with concurrency. In addition, on the desktop itself we need some way to harness the upcoming multi-core revolution. We believe that transactions are the most promising approach in this space. In fact, transactions are the only way ordinary people can deal with concurrency, unless of course you are a sadomasochist who likes to wear black leather and play with locks.
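
As a rough illustration of what "transactions instead of locks" could look like to the programmer, here is a toy sketch in Python in the style of software transactional memory. Everything in it (TVar, Tx, atomically, transfer) is a hypothetical name invented for this example, and the manifesto does not commit to this particular design. The sketch hides one global lock inside the library so that user code never touches a lock; a real implementation would be far more refined.

```python
import threading

_commit_lock = threading.Lock()  # internal to the toy library, never seen by users


class _Conflict(Exception):
    """Raised when a transaction observes that its reads are stale."""


class TVar:
    """A transactional variable holding a value and a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Tx:
    """Per-attempt log of the versions read and the writes pending."""

    def __init__(self):
        self._reads = {}   # TVar -> version seen at first read
        self._writes = {}  # TVar -> value to install at commit

    def _validate(self):
        # Caller must hold _commit_lock.
        if any(tvar.version != seen for tvar, seen in self._reads.items()):
            raise _Conflict()

    def read(self, tvar):
        if tvar in self._writes:
            return self._writes[tvar]
        with _commit_lock:
            self._validate()  # never continue on an inconsistent snapshot
            self._reads.setdefault(tvar, tvar.version)
            return tvar.value

    def write(self, tvar, value):
        self._writes[tvar] = value


def atomically(action):
    """Run action(tx) until it commits without conflict; return its result."""
    while True:
        tx = Tx()
        try:
            result = action(tx)
            with _commit_lock:
                tx._validate()
                for tvar, value in tx._writes.items():
                    tvar.value = value
                    tvar.version += 1
            return result
        except _Conflict:
            continue  # another transaction committed first, so retry


def transfer(src, dst, amount):
    """Move amount from src to dst as one indivisible step."""
    def action(tx):
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    atomically(action)


a, b = TVar(100), TVar(0)
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = atomically(lambda tx: tx.read(a) + tx.read(b))
print(a.value, b.value, total)
```

The point of the design is that transfer states what must happen together and leaves ordering, retrying, and conflict detection to the library.
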

As you can imagine, this is a lot of work and it will keep us language geeks off the streets for a long, long time! And in case you are currently wandering the streets looking for a job as a compiler writer, virtual machine hacker, tool smith, etc., drop me an email. We have several job openings available.

A Formal Language for Analyzing Contracts

By pure serendipity I just stumbled upon this proposal, on Nick Szabo's website; it should appeal to those who found the earlier stories on Lexifi and the Composing Contracts paper interesting. It is not a fresh story (2002) but AFAIK it has not been mentioned here before; also, there is as yet no implementation.

Wadler's Blog: Penn, PADL, POPL, and Plan-X

I spent 5-15 January visiting U Penn and attending PADL, POPL, and Plan-X in Charleston, SC...

Interesting trip report.

I encourage you to lure Philip into an LtU discussion...

Modeling Genome Evolution with a DSEL for Probabilistic Programming sounds interesting, I'll have to look it up.

And talking about possible applications for Links, doesn't building a Google-Web Services APIs-AJAX DSL sound like a cool application for Links? Think about it as the easiest way to program AJAX applications based on web services APIs, and automatically integrated into the Google universe (think Google Maps etc.) If anyone from Google is reading this - this might be a cool "20 percent time" project... I won't go on since I am sure you can all imagine the possibilities (e.g., think about it as a scripting language for writing widgets for a future incarnation of a "Google Pack").

Haskell is not not ML

Haskell is not not ML. Ben Rudiak-Gould, Alan Mycroft, and Simon Peyton Jones. European Symposium on Programming 2006 (ESOP'06).

We present a typed calculus IL ("intermediate language") which supports the embedding of ML-like (strict, eager) and Haskell-like (non-strict, lazy) languages, without favoring either. IL's type system includes negation (continuations), but not implication (function arrow). Within IL we find that lifted sums and products can be represented as the double negation of their unlifted counterparts. We exhibit a compilation function from IL to AM --- an abstract von Neumann machine --- which maps values of ordinary and doubly negated types to heap structures resembling those found in practical implementations of languages in the ML and Haskell families. Finally, we show that a small variation in the design of AM allows us to treat any ML value as a Haskell value at runtime without cost, and project a Haskell value onto an ML type with only the cost of a Haskell deepSeq. This suggests that IL and AM may be useful as a compilation and execution model for a new language which combines the best features of strict and non-strict functional programming.

The authors start from the claim that most of the differences between SML and Haskell are independent of evaluation order. Is it possible, they wonder, to design a hybrid language which in some way abstracts over possible evaluation orders?

This paper leaves the language design for future work, and concentrates on the implementation costs. The results seem positive, so one hopes this project will mature and end the civil war between lazy and eager functional programming...
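
To make the abstract's deepSeq remark a little more concrete, here is a loose Python analogy, not the paper's IL or AM. Thunk and deep_seq are hypothetical names for this sketch: a lazy value is modeled as a suspended computation, and projecting it onto a strict type means forcing every suspension it contains. Unlike the paper's design, wrapping a strict value here is not free.

```python
class Thunk:
    """A suspended computation that is evaluated at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._compute()
            self._compute = None  # drop the closure once it has run
            self._forced = True
        return self._value


def deep_seq(value):
    """Force thunks recursively and return a fully evaluated structure."""
    while isinstance(value, Thunk):
        value = value.force()
    if isinstance(value, tuple):
        return tuple(deep_seq(v) for v in value)
    if isinstance(value, list):
        return [deep_seq(v) for v in value]
    return value


log = []  # records which computations actually ran, and in what order


def noisy_square(n):
    log.append(n)
    return n * n


# A "lazy" pair: nothing is computed when it is built.
lazy_pair = (
    Thunk(lambda: noisy_square(3)),
    Thunk(lambda: [Thunk(lambda: noisy_square(4)), 5]),
)
print(log)  # still empty here

# Projecting onto the "strict" side costs one full traversal.
strict_pair = deep_seq(lazy_pair)
print(strict_pair, log)
```

The traversal in deep_seq is the only work needed to cross from the lazy side to the strict side, which mirrors the cost the abstract attributes to that direction.
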

More information on this project is likely to appear here.

Haskell vs. Erlang, Reloaded

The goal of my project was to be able to thoroughly test a poker server using poker bots. Each poker bot was to exercise different parts of the server by talking the poker protocol consisting of 150+ binary messages. The poker server itself is written in C++ and runs on Windows....

This app is all about binary IO, thousands of threads/processes and easy serialization. All I ever wanted to do was send packets back and forth, analyze them and have thousands of poker bots running on my machine doing same. Lofty but simple goal :-). Little did I know!

Erlang and Haskell compared... Want to know the conclusion?

I was able to finish the Erlang version 10 times faster and with 1/2 the code. Even if I cut the 10-11 weeks spent on the Haskell version in half to account for the learning curve, I would still come out way ahead with Erlang.

I am sure you'll find a lot to disagree with in this article...

Infrastructure Announcement

Later today we will be upgrading the Drupal software used to run LtU, and migrating to a new server.

Hopefully, this move will ensure that we have fewer outages and better performance.

We will not be rolling out any major new features today, but the Drupal upgrade allows us to solve some of the problems you've all been complaining about, and we will be offering new features over time.

When we are stable again, and once we all have a chance to catch our breath, we'll get around to adding the Wiki component we've been dreaming about for some time (aka the LtU-opedia...).

The first step, however, is ensuring today's move goes according to plan. Anton is on the case, so we don't have a lot to worry about. If you experience any problems with LtU once the server move is complete, please let us know ASAP.

Be advised that for a short period of time posting to the site will be disabled.

Semantic Distance: NLP Not a Resource Sink

Following the story on Mind Mappers and other RDF comments of late, I thought this NLP slide show (PDF) should get a story. Dr. Adrian Walker offers an interesting perspective in a friendly crayon-colored format, including a critique of RDF. Source site Internet Business Logic has other offerings including an online demo.

Rebol - Dialects, Spreadsheets

Gregg Irwin just sent an interesting email (about Rebol) to the pragprog mailing list. I can't work out how to access the Yahoo archives so instead I'll post chunks here. But first I'll give some background links (there are also two links at the very end of this post to a two-part article on an implementation of a spreadsheet in Rebol - I'm not sure how the cells communicate, but that might be interesting):

  1. Rebol dialects - blocks that carry condensed meaning through the use of a different grammar (ordering) of values and words. Dialects are usually unique and well-suited to the problems they are designed to solve. (DSLs)
  2. Tutorial
  3. License - a bit odd/worrying (how can you not modify a language that expressly encourages DSLs?)

Fragments of the email inside...

Lisp is sin

People are discussing this blog post all over, so we might as well mention it here.

The discussion is quite balanced, though you are likely to disagree about the specifics. One issue, discussed here repeatedly, is that code=data doesn't require S-expressions. Lisp expressiveness runs deeper than that.

Our discussion of Spolsky's Java Schools essay is here, by the way.

Spring School on Datatype-Generic Programming 2006

If you are interested in generic programming and have some free time in April, this is for you.

LtU readers will recognize the names of the lecturers, if not the specific presentation titles. Among the lecturers are Jeremy Gibbons, Ralf Hinze, Ralf Lämmel and Tim Sheard (who will talk about putting Curry-Howard to work).